Memory-Efficient Backpropagation


A Graph Theoretic Framework of Recomputation Algorithms for Memory-Efficient Backpropagation

Kusumoto, Mitsuru, Inoue, Takuya, Watanabe, Gentaro, Akiba, Takuya, Koyama, Masanori

Neural Information Processing Systems

Recomputation algorithms collectively refer to a family of methods that aim to reduce the memory consumption of backpropagation by selectively discarding intermediate results of the forward propagation and recomputing the discarded results as needed. In this paper, we propose a novel and efficient recomputation method that can be applied to a wider range of neural nets than previous methods. We use the language of graph theory to formalize the general recomputation problem of minimizing computational overhead under a fixed memory budget, and provide a dynamic programming solution to the problem. Our method can reduce the peak memory consumption on various benchmark networks by 36%–81%, which outperforms the reduction achieved by other methods.
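
The paper's contribution is a graph-theoretic dynamic program that decides, for an arbitrary computation graph, which intermediate results to discard and recompute. As a rough illustration of the underlying recomputation idea (not the authors' solver), the sketch below uses PyTorch's torch.utils.checkpoint utilities with a hand-picked segmentation of a sequential model; the layer sizes and the choice of four segments are arbitrary placeholders.

```python
# Minimal sketch of recomputation (gradient checkpointing) in PyTorch.
# Activations inside each segment are discarded during the forward pass and
# recomputed when the backward pass needs them, trading extra compute for a
# lower peak memory footprint.
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint_sequential

# A toy chain of layers; depth and sizes are arbitrary for illustration.
model = nn.Sequential(*[nn.Sequential(nn.Linear(1024, 1024), nn.ReLU())
                        for _ in range(16)])
x = torch.randn(64, 1024, requires_grad=True)

# Keep only the activations at 4 segment boundaries; everything inside a
# segment is recomputed during the backward pass.
y = checkpoint_sequential(model, 4, x)
y.sum().backward()
```

The graph-theoretic formulation in the paper replaces this fixed, sequential segmentation with a memory-budgeted optimization over general network graphs.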


Memory-Efficient Backpropagation Through Time

Gruslys, Audrunas, Munos, Remi, Danihelka, Ivo, Lanctot, Marc, Graves, Alex

Neural Information Processing Systems

We propose a novel approach to reduce memory consumption of the backpropagation through time (BPTT) algorithm when training recurrent neural networks (RNNs). Our approach uses dynamic programming to balance a trade-off between caching intermediate results and recomputing them. The algorithm can fit tightly within almost any user-set memory budget while finding an execution policy that minimizes computational cost. Computational devices have limited memory capacity, and maximizing computational performance under a fixed memory budget is a practical use case. We provide asymptotic upper bounds on computation for various regimes. The algorithm is particularly effective for long sequences: for sequences of length 1000, it saves 95% of memory usage while using only one third more time per iteration than standard BPTT.
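
The caching-versus-recomputation trade-off can be illustrated with the simplest fixed-interval policy: checkpoint the hidden state every ~sqrt(T) steps and recompute the states inside a segment during the backward pass, which stores O(sqrt(T)) states instead of O(T). The paper's dynamic program generalizes this to an optimal policy for any memory budget; the sketch below is only the fixed-interval special case, and `cell` and `cell_grad` are hypothetical stand-ins for an RNN step and its one-step gradient.

```python
# Sketch of BPTT with sqrt(T) checkpointing: a fixed-interval special case of
# the caching/recomputation trade-off (the paper's DP chooses the policy
# optimally under an arbitrary memory budget).
import math

def bptt_checkpointed(cell, cell_grad, h0, inputs, dL_dhT):
    """cell(h, x) -> next state; cell_grad(h, x, dh_next) -> dh (hypothetical)."""
    T = len(inputs)
    k = max(1, math.isqrt(T))                  # checkpoint interval ~ sqrt(T)

    # Forward pass: keep only every k-th hidden state (plus the initial one).
    saved, h = {}, h0
    for t in range(T):
        if t % k == 0:
            saved[t] = h                       # state h_t before consuming x_t
        h = cell(h, inputs[t])

    # Backward pass: walk the segments from last to first, recomputing the
    # discarded states of a segment just before backpropagating through it.
    dh = dL_dhT
    for b in range(((T - 1) // k) * k, -1, -k):
        end = min(b + k, T)
        hs = [saved[b]]
        for t in range(b, end):                # recompute h_{b+1}, ..., h_end
            hs.append(cell(hs[-1], inputs[t]))
        for t in range(end - 1, b - 1, -1):    # local backprop over the segment
            dh = cell_grad(hs[t - b], inputs[t], dh)
    return dh                                  # gradient w.r.t. the initial state
```

Only about T/k saved states plus k recomputed states are held at any time; the paper's DP instead searches over all checkpoint placements (including nested recomputation) to minimize recomputation cost for whatever memory budget the user sets.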


Reviews: A Graph Theoretic Framework of Recomputation Algorithms for Memory-Efficient Backpropagation

Neural Information Processing Systems

The paper proposes a method that reduces the memory consumption of backpropagation and, as a result, allows the use of larger batch sizes. The authors state that this is significant, e.g., with batch norm, where batch size matters. In my own experience, I believe this is also significant for improved GPU utilization and data-parallel training. The paper is original in that I have not seen a similar treatment (although the actual solution may have been relatively straightforward; asking the right question was an important part of this). The paper is well written and easy to follow.


Reviews: A Graph Theoretic Framework of Recomputation Algorithms for Memory-Efficient Backpropagation

Neural Information Processing Systems

This work formalizes the problem of minimizing memory consumption through recomputation when performing a forward-backward evaluation of a computation graph; it provides an optimal dynamic programming algorithm and an efficient heuristic, and demonstrates strong improvements in memory savings over existing methods. Three expert reviewers initially assessed the paper as 8/8/5, and the authors provided a detailed rebuttal. All reviewers took part in a discussion, and the final assessment is 8/8/6, with reviewer consensus that the method is practically useful and the reported gains are strong. Overall this work is a nice contribution.


Reviews: Memory-Efficient Backpropagation Through Time

Neural Information Processing Systems

The authors are solving an important problem. RNN training procedures can be greedy for memory, and, given the sequential nature of the computation, it is not trivial to simply scale the training of each sequence across many machines. As a result, it is important to use memory and computational resources judiciously to train RNNs efficiently. I'm pleased to see the authors not only proposing a new instance of a solution, but also providing a user-selectable tradeoff between the amount of computation and the memory usage.


Memory-Efficient Backpropagation through Large Linear Layers

Bershatsky, Daniel, Mikhalev, Aleksandr, Katrutsa, Alexandr, Gusak, Julia, Merkulov, Daniil, Oseledets, Ivan

arXiv.org Machine Learning

In modern neural networks such as Transformers, linear layers require significant memory to store activations during the backward pass. This study proposes a memory reduction approach for backpropagation through linear layers. Since the gradients of linear layers are computed by matrix multiplications, we consider methods for randomized matrix multiplication and demonstrate that they require less memory with only a moderate decrease in test accuracy. We also investigate the variance of the gradient estimate induced by the randomized matrix multiplication, and compare it with the variance coming from gradient estimation based on the batch of samples. We demonstrate the benefits of the proposed method on fine-tuning the pre-trained RoBERTa model on GLUE tasks.
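
For a linear layer Y = XW, the weight gradient is the matrix product X^T (dL/dY), so the memory cost comes from keeping the full activation matrix X until the backward pass. A generic way to randomize this product, shown below as a rough illustration of the idea (not necessarily the exact estimator studied in the paper), is to keep only a rescaled sample of the rows of X chosen from forward-pass information; `sampled_grad_w` is a hypothetical name for such an estimator.

```python
# Sketch of a randomized (sampled) matrix product for the weight gradient of a
# linear layer Y = X W.  Exactly, grad_W = X.T @ grad_Y, which requires storing
# the full activation matrix X.  Sampling k rows of X at forward time (with
# probabilities that depend only on X) lets the remaining rows be discarded,
# at the price of variance in the gradient estimate.
import numpy as np

def sampled_grad_w(X_kept, grad_Y_rows, scale):
    """Unbiased estimate of X.T @ grad_Y from the kept, rescaled rows."""
    return (X_kept * scale[:, None]).T @ grad_Y_rows

rng = np.random.default_rng(0)
X = rng.standard_normal((512, 64))        # activations (batch x in_features)
grad_Y = rng.standard_normal((512, 32))   # output gradient (batch x out_features)

# Forward time: sample k rows with probability proportional to their norm and
# keep only those rows (the rest of X can be freed before the backward pass).
p = np.linalg.norm(X, axis=1)
p = p / p.sum()
k = 128
idx = rng.choice(X.shape[0], size=k, replace=True, p=p)
scale = 1.0 / (k * p[idx])                # importance-sampling weights
X_kept = X[idx]

# Backward time: the sampled product approximates the exact weight gradient.
approx = sampled_grad_w(X_kept, grad_Y[idx], scale)
exact = X.T @ grad_Y
print(np.linalg.norm(approx - exact) / np.linalg.norm(exact))
```

The abstract's variance analysis concerns exactly this kind of estimator: the sampling adds noise to the gradient, which the paper compares against the noise already present from using a minibatch of samples.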

